On variable sampling frequencies in speech recognition

Authors

  • Fu-Hua Liu
  • Michael Picheny
Abstract

In this paper we describe a novel approach to the problem of differing sampling frequencies in speech recognition. In general, when a recognition task requires a sampling frequency different from that of the reference system, it is customary to retrain the system at the new sampling rate. To circumvent this tedious training process, we propose a new approach, termed Sampling Rate Transformation (SRT), that performs the transformation directly on the speech recognition system. By re-scaling the mel-filter design and filtering the system in the spectral domain, SRT converts the existing system to the target spectral range. New systems are obtained without using any data from the test environment. Preliminary experiments show that SRT reduces the word error rate from 29.89% to 18.17% given 11 kHz test data and a 16 kHz SI system. The matched system for 11 kHz has an error rate of 16.17%. We also examine MLLR and MAP. The best result from MLLR is 17.92% with 4.5 hours of speech. In the speaker adaptation mode, SRT reduces the error rate from 15.48% to 9.71% given 11 kHz test data and a 16 kHz SA system, while the matched 11 kHz SA system has an error rate of 9.33%.
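
The abstract's remark about re-scaling the mel-filter design can be illustrated with a small sketch. This is not the SRT algorithm itself, whose details the abstract does not give. It only shows, under stated assumptions, how the edge frequencies of a mel filter bank depend on the sampling rate, and which filters of a 16 kHz design lie entirely inside the band available at 11 kHz. The filter count of 24, the 0 Hz lower edge, the common 2595 * log10(1 + f/700) mel formula, and all function names are assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the common 2595*log10(1 + f/700) formula (assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_edges(sample_rate_hz, num_filters, low_hz=0.0):
    """Return num_filters + 2 edge frequencies (Hz), evenly spaced on the
    mel scale between low_hz and the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2.0
    lo, hi = hz_to_mel(low_hz), hz_to_mel(nyquist_hz)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(num_filters + 2)]


def filters_within_band(edges_hz, target_nyquist_hz):
    """Indices of triangular filters (filter k spans edges[k]..edges[k+2])
    whose upper edge does not exceed the target Nyquist frequency."""
    return [k for k in range(len(edges_hz) - 2)
            if edges_hz[k + 2] <= target_nyquist_hz]


# Hypothetical configuration: 24 filters, 16 kHz reference, 11 kHz target.
NUM_FILTERS = 24
edges_16k = mel_filter_edges(16000, NUM_FILTERS)
edges_11k = mel_filter_edges(11000, NUM_FILTERS)
kept = filters_within_band(edges_16k, 11000 / 2.0)

print("16 kHz filters fully inside the 11 kHz band:", len(kept), "of", NUM_FILTERS)
print("Top edge, 16 kHz design: %.1f Hz" % edges_16k[-1])
print("Top edge, 11 kHz design: %.1f Hz" % edges_11k[-1])
```

The sketch makes the mismatch concrete: a filter bank designed for 16 kHz extends to 8000 Hz, while 11 kHz data carries no information above 5500 Hz, so the upper filters of the reference design have nothing to measure. How SRT maps the existing models onto the reduced range is described in the paper, not here.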

Similar resources

Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language

The setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of anger and happiness on speech features was evaluated and the results were compared with neutral speech. To this end, the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...

On the advantage of frequency-filtering features for speech recognition with variable sampling frequencies. Experiments with SpeechDat-Car databases

When a speech recognition system has to work with signals sampled at different frequencies, multiple acoustic models may have to be maintained. To avoid this drawback, the system can be trained at the highest expected sampling frequency and the acoustic models subsequently converted to a new sampling frequency. However, the usual mel-frequency cepstral coefficients are not wel...

Signal Modeling with Non-uniform Time Sampling of Features for Automatic Speech Recognition

Montri Karnjanadecha, Old Dominion University, 2000. Director: Dr. Stephen A. Zahorian. This dissertation presents an investigation of nonuniform time sampling methods for spectral/temporal feature extraction in speech. Frame-based features were computed based on an encoding of the global spectral shape usi...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Viterbi decoding for latent words language models using gibbs sampling

This paper introduces a new approach that directly uses latent words language models (LWLMs) in automatic speech recognition (ASR). LWLMs are effective against data sparseness because of their soft-decision clustering structure and Bayesian modeling, so they can be expected to perform robustly in multiple ASR tasks. Unfortunately, applying an LWLM to ASR is difficult because of its comp...

Publication date: 1998